Generate document body from comments

Every Markdown feature supported within .Rmd documents also worked for me from within R scripts. Here are some that I tested and use most frequently:

# Comments without the extra tick show up like this and are included in code blocks.
# loading mtcars data
data(mtcars)
# pdf.tbl[, c(2:11)] <- lapply(pdf.tbl[, c(2:11)], function(y) as.numeric(gsub('[^a-zA-Z0-9.]', '', y)))  # https://tinyurl.com/ya4ok9tb
# pdf.tbl[is.na(pdf.tbl)] <- ""
# pdf.tbl<-dplyr::mutate_if(pdf.tbl, is.numeric, format_dol_fun)

# data(pdf.tbl)

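library(DT)          # datatable(), formatCurrency()
library(data.table)  # data.table()

# Interactive table: no search box, 15 rows per page by default
pdf3 <- datatable(data.table(pdf.tbl), options = list(
  searching = FALSE,
  pageLength = 15,
  lengthMenu = c(5, 10, 15, 20)
))
# Show columns 2 to 11 as currency with no decimal places
pdf3 <- formatCurrency(pdf3, 2:11, digits = 0)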

pdf3
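The difference between document text (`#'`) and plain comments (`#`) can be checked without rendering anything. The sketch below assumes knitr is installed and uses `knitr::spin()` with `knit = FALSE`, which only converts the script to R Markdown. The three-line script is made up for illustration.

```r
library(knitr)

# A tiny script: one #' text line, one plain comment, one statement.
script <- c(
  "#' This line becomes **document text**.",
  "# This plain comment stays inside the code block.",
  "data(mtcars)"
)

# knit = FALSE stops after the script-to-R-Markdown conversion,
# so the code itself is not executed.
rmd <- spin(text = script, knit = FALSE)
cat(rmd, sep = "\n")
```

In the converted text, the `#'` prefix is gone from the first line, while the plain comment sits inside the generated code chunk next to `data(mtcars)`.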

Messing with data

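library('knitr')        # kable()
library('dplyr')        # %>%, mutate(), select()
library('kableExtra')   # cell_spec(), kable_styling(), column_spec(), add_header_above()
library('formattable')  # color_tile(), color_bar()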
mtcars[1:5, 1:4] %>%
  mutate(
    car = row.names(.),
    mpg = color_tile("white", "orange")(mpg),
    cyl = cell_spec(cyl, "html", angle = (1:5)*60, 
                    background = "red", color = "white", align = "center"),
    disp = ifelse(disp > 200,
                  cell_spec(disp, "html", color = "red", bold = T),
                  cell_spec(disp, "html", color = "green", italic = T)),
    hp = color_bar("lightgreen")(hp)
  ) %>%
  select(car, everything()) %>%
  kable("html", escape = F) %>%
  kable_styling("hover", full_width = F) %>%
  column_spec(5, width = "3cm") %>%
  add_header_above(c(" ", "Hello" = 2, "World" = 2))
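Header groups: Hello spans mpg and cyl; World spans disp and hp.

| car | mpg | cyl | disp | hp |
| --- | --- | --- | --- | --- |
| Mazda RX4 | 21.0 | 6 | 160 | 110 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 |
| Datsun 710 | 22.8 | 4 | 108 | 93 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 |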

3 ways to print an object

…specifically a data.frame in this case. Ordered from least to most pretty (in my opinion).

print(head(pdf.tbl))
##            Description 2007-08 2008-09 2009-10 2010-11 2011-12 2012-13
## 1     Resident Costs*:      NA      NA      NA      NA      NA      NA
## 2       Tuition & Fees    5622    6030    7530    8736    9472    9842
## 3     Books & Supplies     840     900     960    1030    1078     848
## 4         Room & Board    7292    7528    8046    8460    8708    8970
## 5       Misc. & Travel    2300    2300    1464    1510    1562    1590
## 6 Resident Total Costs   16054   16758   18000   19736   20820   21250
##   2013-14 2014-15 2015-16 2016-17
## 1      NA      NA      NA      NA
## 2   10262   10836   11622   11634
## 3     916     800     840    1006
## 4    9246    9246    9450    9616
## 5    1640    1798    3222    3952
## 6   22064   22680   25134   26208
print(head(mtcars))
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
knitr::kable(head(pdf.tbl))
| Description | 2007-08 | 2008-09 | 2009-10 | 2010-11 | 2011-12 | 2012-13 | 2013-14 | 2014-15 | 2015-16 | 2016-17 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Resident Costs*: | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| Tuition & Fees | 5622 | 6030 | 7530 | 8736 | 9472 | 9842 | 10262 | 10836 | 11622 | 11634 |
| Books & Supplies | 840 | 900 | 960 | 1030 | 1078 | 848 | 916 | 800 | 840 | 1006 |
| Room & Board | 7292 | 7528 | 8046 | 8460 | 8708 | 8970 | 9246 | 9246 | 9450 | 9616 |
| Misc. & Travel | 2300 | 2300 | 1464 | 1510 | 1562 | 1590 | 1640 | 1798 | 3222 | 3952 |
| Resident Total Costs | 16054 | 16758 | 18000 | 19736 | 20820 | 21250 | 22064 | 22680 | 25134 | 26208 |
knitr::kable(head(mtcars))
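|  | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |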

including #+ results='asis' chunk option for formatting

knitr::kable(pdf.tbl)
| Description | 2007-08 | 2008-09 | 2009-10 | 2010-11 | 2011-12 | 2012-13 | 2013-14 | 2014-15 | 2015-16 | 2016-17 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Resident Costs*: | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| Tuition & Fees | 5622 | 6030 | 7530 | 8736 | 9472 | 9842 | 10262 | 10836 | 11622 | 11634 |
| Books & Supplies | 840 | 900 | 960 | 1030 | 1078 | 848 | 916 | 800 | 840 | 1006 |
| Room & Board | 7292 | 7528 | 8046 | 8460 | 8708 | 8970 | 9246 | 9246 | 9450 | 9616 |
| Misc. & Travel | 2300 | 2300 | 1464 | 1510 | 1562 | 1590 | 1640 | 1798 | 3222 | 3952 |
| Resident Total Costs | 16054 | 16758 | 18000 | 19736 | 20820 | 21250 | 22064 | 22680 | 25134 | 26208 |
| Nonresident Costs*: | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| Tuition & Fees | 20726 | 22342 | 25740 | 26946 | 27682 | 28052 | 28472 | 29046 | 29832 | 29844 |
| Books & Supplies | 840 | 900 | 960 | 1030 | 1078 | 848 | 916 | 800 | 840 | 1006 |
| Room & Board | 7292 | 7528 | 8046 | 8460 | 8708 | 8970 | 9246 | 9246 | 9450 | 9616 |
| Misc. & Travel | 2300 | 2300 | 1464 | 1510 | 1562 | 1590 | 1640 | 1798 | 3746 | 4662 |
| Nonresident Total Costs | 31158 | 33070 | 36210 | 37946 | 39030 | 39460 | 40274 | 40890 | 43868 | 45128 |
knitr::kable(head(mtcars))
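|  | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |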
knitr::kable(list(head(mtcars), head(mtcars)))
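|  | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |

|  | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |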

Plotting

plot(mtcars$mpg, mtcars$disp, col=mtcars$cyl, pch=19)
plot(mtcars$mpg, mtcars$disp, col=mtcars$cyl, pch=19)

We can change the chunk options we would use for a code block using knitr by using a comment that starts with #+. For example, to change the plot size, we can specify #+ fig.width=4, fig.height=4 before plotting.

A new chunk is automatically generated (and chunk settings reset) whenever we add document text with #' or change chunk options again with #+. However, it is possible to specify global chunk options, if desired.
#+ fig.width=4, fig.height=4

[Figures: two plots]
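To see how a `#+` line turns into chunk options, the script can be converted without knitting it. This is a minimal sketch assuming knitr is installed; the four-line script is invented for illustration, and only the `#+ fig.width=4, fig.height=4` line comes from the text above.

```r
library(knitr)

script <- c(
  "#' A plot at the default size:",
  "plot(mtcars$mpg, mtcars$disp)",
  "#+ fig.width=4, fig.height=4",
  "plot(mtcars$mpg, mtcars$disp)"
)

# Convert only; knit = FALSE means no plots are actually drawn.
rmd <- spin(text = script, knit = FALSE)
cat(rmd, sep = "\n")
```

The `#+` line should start a second code chunk whose header carries `fig.width=4, fig.height=4`, while the first plot stays in a chunk with no options.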

Small plots test

[Figure: grad_residence]

Small plots often render with strange resolution and relative sizings of labels, axes, etc. The dpi chunk option can be used to fix this. Just be sure to adjust the fig.width and fig.height accordingly.

Bad plot: #+ fig.width=2, fig.height=2

hist(mtcars$mpg)

Good plot: #+ fig.width=4, fig.height=4, dpi=50

hist(mtcars$mpg)
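The sizing arithmetic behind these options can be sketched in a few lines. The helper `fig_pixels()` is hypothetical, and the calculation ignores any extra scaling an output format may apply (for example `fig.retina`). The `dpi = 100` value in the second call is an assumption chosen only for comparison; the source does not state the dpi of the bad plot.

```r
# Pixel dimensions of a raster figure: size in inches times dots per inch.
fig_pixels <- function(fig.width, fig.height, dpi) {
  c(width = fig.width * dpi, height = fig.height * dpi)
}

# The "good plot" settings from above.
fig_pixels(4, 4, dpi = 50)    # 200 x 200 pixels

# A 2 x 2 inch figure at an assumed dpi of 100 has the same pixel size.
fig_pixels(2, 2, dpi = 100)   # 200 x 200 pixels
```

Both settings give the same number of pixels, but text and margins are sized relative to inches, so on the 4 x 4 inch canvas they take up a smaller share of the figure.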

Generate a series of plots from a loop

for(i in 1:ncol(mtcars)) hist(mtcars[,i], breaks=40, xlab='', main=names(mtcars)[i])

Let’s build a random forest model

… and explore some model output. First here’s a big chunk of text from the random forest documentation:

Random forest documentation:
randomForest implements Breiman’s random forest algorithm (based on Breiman and Cutler’s original Fortran code) for classification and regression. It can also be used in unsupervised mode for assessing proximities among data points.

Note:
The forest structure is slightly different between classification and regression. For details on how the trees are stored, see the help page for getTree.

If xtest is given, prediction of the test set is done “in place” as the trees are grown. If ytest is also given, and do.trace is set to some positive integer, then for every do.trace trees, the test set error is printed. Results for the test set is returned in the test component of the resulting randomForest object. For classification, the votes component (for training or test set data) contain the votes the cases received for the classes. If norm.votes=TRUE, the fraction is given, which can be taken as predicted probabilities for the classes.

For large data sets, especially those with large number of variables, calling randomForest via the formula interface is not advised: There may be too much overhead in handling the formula.

The “local” (or casewise) variable importance is computed as follows: For classification, it is the increase in percent of times a case is OOB and misclassified when the variable is permuted. For regression, it is the average increase in squared OOB residuals when the variable is permuted.

# OK, now let's actually use randomForest
library('randomForest')
mtcars$am <- as.factor(mtcars$am)
rf <- randomForest(am~., ntree=100, data=mtcars)

Here’s the resulting confusion matrix. It is computed from out-of-bag (OOB) predictions on the training data, and it is not very clear to a non-technical or non-forest-savvy audience.

print(rf)
## 
## Call:
##  randomForest(formula = am ~ ., data = mtcars, ntree = 100) 
##                Type of random forest: classification
##                      Number of trees: 100
## No. of variables tried at each split: 3
## 
##         OOB estimate of  error rate: 12.5%
## Confusion matrix:
##    0  1 class.error
## 0 17  2   0.1052632
## 1  2 11   0.1538462

Dynamic comments

We can get fancy and dynamically generate some commentary around these results. That is, we can auto-fill parts of our document text with objects from the R environment. This is useful for analyses with stochastic elements that change from run to run, like random forest, or for any analysis where results are subject to change.


17 cars are correctly classified as 0.
11 cars are correctly classified as 1.
2 cars are misclassified as 1.
2 cars are misclassified as 0.

These numbers were generated by wrapping the R expression to execute in backticks. I don’t know how to write this within a #' comment without evaluating it, so I’m documenting it here as a character string:

## #' `r rf$confusion[2,1]` cars are misclassified as 0.
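Each of the four sentences corresponds to one cell of `rf$confusion`, with rows as actual classes and columns as predicted classes. The sketch below hard-codes the counts printed above in a matrix named `conf` (a stand-in, so the example runs without refitting the forest) and builds the same sentences with `sprintf()`.

```r
# Stand-in for rf$confusion[, 1:2], using the counts printed above.
conf <- matrix(
  c(17L, 2L,    # column "0": actual 0, actual 1
    2L, 11L),   # column "1": actual 0, actual 1
  nrow = 2,
  dimnames = list(actual = c("0", "1"), predicted = c("0", "1"))
)

msgs <- c(
  sprintf("%d cars are correctly classified as 0.", conf[1, 1]),
  sprintf("%d cars are correctly classified as 1.", conf[2, 2]),
  sprintf("%d cars are misclassified as 1.", conf[1, 2]),
  sprintf("%d cars are misclassified as 0.", conf[2, 1])
)
cat(msgs, sep = "\n")
```

In the script itself, each indexing expression would sit inside an inline `r` expression on a `#'` line, as in the string shown above.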


Generate comments in a loop

This is useful if you want to generate lots of text without writing it manually.
#+ results='asis'

for (i in 1:10) {
  rf <- randomForest(am~., ntree=100, data=mtcars)
  cat("iteration ",i, ": ", rf$confusion[1,1], "cars are correctly classified as 0", "\n")
  cat('\n')
}

iteration 1 : 17 cars are correctly classified as 0

iteration 2 : 18 cars are correctly classified as 0

iteration 3 : 16 cars are correctly classified as 0

iteration 4 : 17 cars are correctly classified as 0

iteration 5 : 17 cars are correctly classified as 0

iteration 6 : 17 cars are correctly classified as 0

iteration 7 : 17 cars are correctly classified as 0

iteration 8 : 17 cars are correctly classified as 0

iteration 9 : 17 cars are correctly classified as 0

iteration 10 : 17 cars are correctly classified as 0
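Because `results='asis'` passes `cat()` output through as Markdown, the same loop pattern can emit formatted text. The following is a small sketch that does not use the forest. The helper name `emit_summaries()` and the choice of columns are assumptions for illustration.

```r
# Print one bolded Markdown line per variable; meant for a chunk
# that has the results='asis' option set.
emit_summaries <- function(vars, data = mtcars) {
  for (v in vars) {
    cat("**", v, "**: mean ", round(mean(data[[v]]), 2), "\n\n", sep = "")
  }
}

emit_summaries(c("mpg", "hp", "wt"))
```

The blank line written after each sentence keeps the entries as separate paragraphs once the Markdown is rendered.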

Toggle chunk settings globally by R variables

Much like we used R objects to dynamically generate text to print in the document (in the form of comments), we can use R objects to dynamically specify chunk options.
When we set evaluateStuff to TRUE or FALSE, the following 3 chunks will be evaluated (or not) as we choose. We can toggle them all with one variable, instead of manually changing the chunk settings with #+ eval=T in the R script multiple times. Simply reference the variable that controls evaluation in the chunk options, wrapped in backticks.
Like so: #+ eval=`evaluateStuff`

evaluateStuff <- TRUE
cat('first thing to evaluate')
## first thing to evaluate
cat('second thing to evaluate')
## second thing to evaluate
cat('third thing to evaluate')
## third thing to evaluate


Now let’s just print the code and not evaluate anything.

evaluateStuff <- FALSE
cat('first thing to evaluate')
cat('second thing to evaluate')
cat('third thing to evaluate')
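Global chunk options were mentioned earlier. One possible way to apply the same toggle to every subsequent chunk, rather than repeating it per chunk, is knitr's `opts_chunk$set()`. This is a sketch of that alternative, not necessarily how the script above was written.

```r
library(knitr)

evaluateStuff <- FALSE

# Set the default for chunks that follow this call; an individual
# chunk can still override it with its own #+ eval=... line.
opts_chunk$set(eval = evaluateStuff)

# Inspect the current default.
opts_chunk$get("eval")
```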

Converting R Script to HTML/PDF

If you did everything right above, this is the easy part. Simply render the script as desired with the render function from rmarkdown.
rmarkdown::render('/Users/you/Documents/yourscript.R')
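To choose between HTML and PDF explicitly, `render()` takes an `output_format` argument. The wrapper below is a hypothetical convenience function (the name `render_script()` is not part of rmarkdown), and PDF output additionally requires a LaTeX installation.

```r
# Hypothetical helper: render an R script to HTML or PDF.
render_script <- function(path, format = c("html_document", "pdf_document")) {
  format <- match.arg(format)
  # Fail early with a clear error if the script path is wrong.
  stopifnot(file.exists(path))
  rmarkdown::render(path, output_format = format)
}

# Usage, with the placeholder path from above:
# render_script('/Users/you/Documents/yourscript.R', "pdf_document")
```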